Accurate sleep stage classification is important for sleep health assessment. In recent years, several deep-learning- and machine-learning-based sleep staging algorithms have been developed, achieving performance comparable to human annotation. Despite these improvements, a limitation of most deep learning algorithms is their black-box behavior, which restricts their use in clinical settings. Here we propose Cross-Modal Transformers, a transformer-based method for sleep stage classification. Our model achieves performance competitive with state-of-the-art approaches and removes the black-box behavior of deep learning models by exploiting the interpretability of its attention modules. The proposed Cross-Modal Transformer consists of a novel cross-modal transformer encoder architecture together with a multi-scale one-dimensional convolutional neural network for automatic representation learning. With this design, our sleep stage classifier matches or exceeds the performance of state-of-the-art methods while providing interpretability, a roughly four-fold reduction in the number of parameters, and reduced training time compared with the current state of the art. Our code is available at https://github.com/jathurshan0330/cross-modal-transformer.
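As a rough illustration of this kind of pipeline, the PyTorch sketch below feeds a multi-scale 1D-CNN feature extractor into a standard transformer encoder for epoch-level sleep staging. The kernel sizes, layer widths, pooling, and five-class output are illustrative assumptions and not the authors' exact cross-modal design (which attends across signal modalities); see the linked repository for the real implementation.

```python
# Minimal sketch: multi-scale 1D-CNN front end feeding a transformer encoder
# for sleep staging. All hyperparameters here are illustrative only.
import torch
import torch.nn as nn

class MultiScaleCNN(nn.Module):
    """Parallel 1D convolutions with different kernel sizes, concatenated."""
    def __init__(self, in_ch=1, out_ch=64, kernel_sizes=(7, 21, 51)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, out_ch, k, stride=4, padding=k // 2),
                nn.BatchNorm1d(out_ch),
                nn.GELU(),
                nn.MaxPool1d(4),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):                     # x: (batch, channels, time)
        return torch.cat([b(x) for b in self.branches], dim=1)

class SleepStager(nn.Module):
    def __init__(self, d_model=192, n_heads=4, n_layers=2, n_classes=5):
        super().__init__()
        self.cnn = MultiScaleCNN(out_ch=d_model // 3)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=256, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x):                     # x: (batch, 1, time) EEG epoch
        z = self.cnn(x).transpose(1, 2)       # (batch, tokens, d_model)
        z = self.encoder(z)
        return self.head(z.mean(dim=1))       # pool tokens, then classify

if __name__ == "__main__":
    eeg = torch.randn(8, 1, 3000)             # e.g., 30 s epochs at 100 Hz
    print(SleepStager()(eeg).shape)           # torch.Size([8, 5])
```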
We present evidence that learned density functional theory ("DFT") force fields are ready for ground-state catalyst discovery. Our key finding is that, although the predicted forces differ substantially from the ground truth, relaxations using forces from learned models find structures with similar or lower energy than those relaxed with the RPBE functional in over 50% of the evaluated systems. This has the surprising implication that learned potentials may already be ready to replace DFT in challenging catalytic systems such as those found in the Open Catalyst 2020 dataset. Furthermore, we show that a force field trained on a locally harmonic energy surface sharing the same minima as the target DFT energy is also able to find lower- or similar-energy structures in over 50% of cases. This "easy potential" converges in fewer steps than a standard model trained on true energies and forces, which further accelerates computation. Its success illustrates a key point: learned potentials can locate energy minima even when their force errors are high. The main requirement for structure optimization is simply that the learned potential has the correct minima. Since learned potentials are fast and scale linearly with system size, our results open up the possibility of quickly finding the ground state of large systems.
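The relaxation procedure this argument rests on can be summarized in a few lines: repeatedly move atoms along the forces predicted by a potential until the largest force component falls below a threshold. In the sketch below, a crude Lennard-Jones pair force stands in for a trained ML force field, and the step size, convergence threshold, and toy cluster are illustrative assumptions, not the models evaluated in the paper.

```python
# Schematic structure relaxation driven by a (placeholder) learned force field.
import numpy as np

def learned_forces(pos, eps=1.0, sigma=1.0):
    """Per-atom forces, shape (N, 3). Placeholder for an ML force field."""
    forces = np.zeros_like(pos)
    for i in range(len(pos)):
        for j in range(len(pos)):
            if i == j:
                continue
            r = pos[i] - pos[j]
            d = np.linalg.norm(r)
            # force magnitude from the Lennard-Jones pair potential
            f_mag = 24 * eps * (2 * (sigma / d) ** 12 - (sigma / d) ** 6) / d
            forces[i] += f_mag * r / d
    return forces

def relax(pos, force_fn, step=0.005, fmax=0.05, max_steps=1000):
    """Steepest-descent relaxation: move atoms along the predicted forces."""
    for _ in range(max_steps):
        f = force_fn(pos)
        if np.abs(f).max() < fmax:        # converged: largest force is small
            break
        pos = pos + step * f
    return pos

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy cluster: slightly perturbed 2x2x2 cubic arrangement of 8 atoms.
    grid = np.array([[i, j, k] for i in range(2)
                     for j in range(2) for k in range(2)], float) * 1.3
    atoms = grid + 0.05 * rng.normal(size=grid.shape)
    print(relax(atoms, learned_forces))
```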
Bayesian optimization (BO) methods seek the global optimum of an objective function that is only available as a black box or is expensive to evaluate. Such methods construct a surrogate model of the objective function and quantify the uncertainty in that surrogate through Bayesian inference. Objective evaluations are then chosen sequentially by maximizing an acquisition function at each step. However, this auxiliary optimization problem can be highly non-trivial to solve because of the non-convexity of the acquisition function, particularly in the case of batch Bayesian optimization. In this work we reformulate the batch selection as an optimization problem over the space of probability measures. We construct a new acquisition function based on multipoint expected improvement that is convex over the space of probability measures. Practical schemes for solving this "inner" optimization problem arise naturally as gradient flows of this objective function. We demonstrate the efficacy of this new approach on different benchmark functions and compare it with state-of-the-art batch BO methods.
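A toy particle view of this idea: treat a batch of q query points as an empirical measure and update it by gradient ascent on a Monte Carlo estimate of multipoint expected improvement under a simple GP surrogate. The RBF kernel, noise and jitter levels, Adam optimizer, and 1-D toy objective below are illustrative assumptions rather than the paper's algorithm.

```python
# Toy sketch: gradient ascent of a batch of "particles" on Monte Carlo q-EI
# under an exact GP surrogate. All modelling choices are illustrative.
import torch

torch.set_default_dtype(torch.float64)
torch.manual_seed(0)

def rbf(a, b, lengthscale=0.3):
    d2 = (a.unsqueeze(1) - b.unsqueeze(0)).pow(2).sum(-1)
    return torch.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(Xq, X, y, noise=1e-4):
    """Exact GP posterior mean and covariance at Xq given data (X, y)."""
    K = rbf(X, X) + noise * torch.eye(len(X))
    Ks, Kss = rbf(Xq, X), rbf(Xq, Xq)
    mean = Ks @ torch.linalg.solve(K, y)
    cov = Kss - Ks @ torch.linalg.solve(K, Ks.T)
    return mean, cov + 1e-8 * torch.eye(len(Xq))       # jitter for stability

def mc_qei(Xq, X, y, n_samples=256):
    """Monte Carlo multipoint EI: E[max(0, max_i f(x_i) - best observed)]."""
    mean, cov = gp_posterior(Xq, X, y)
    L = torch.linalg.cholesky(cov)
    samples = mean + torch.randn(n_samples, len(Xq)) @ L.T   # joint draws
    return (samples.max(dim=1).values - y.max()).clamp(min=0).mean()

f = lambda x: torch.sin(6 * x).squeeze(-1)     # 1-D toy objective on [0, 1]
X = torch.rand(5, 1)                           # previously observed inputs
y = f(X)

batch = torch.rand(4, 1, requires_grad=True)   # q = 4 particles = the batch
opt = torch.optim.Adam([batch], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    (-mc_qei(batch, X, y)).backward()          # ascend the MC q-EI estimate
    opt.step()
    with torch.no_grad():
        batch.clamp_(0.0, 1.0)                 # project back onto the domain
print(batch.detach().squeeze(-1))
```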
Designing deep neural networks (DNNs) that run on edge hardware remains a challenge. Standard designs have been adopted by the community to facilitate the deployment of neural network models. However, not much emphasis has been put on adapting the network topology to fit hardware constraints. In this paper, we adapt one of the most widely used architectures for mobile hardware platforms, MobileNetV2, and study the impact of changing its topology and applying post-training quantization. We discuss the impact of the adaptations and of deploying the model on an embedded hardware platform for face detection.
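For concreteness, a generic PyTorch post-training static quantization flow on torchvision's quantizable MobileNetV2 looks roughly as follows. The random calibration tensors and the fbgemm backend are illustrative choices, and this is not the paper's exact toolchain or hardware target.

```python
# Generic post-training static quantization of MobileNetV2 in PyTorch,
# as a stand-in for the deployment pipeline discussed above.
import torch
from torchvision.models.quantization import mobilenet_v2

model = mobilenet_v2(quantize=False)    # float model with quant/dequant stubs
model.eval()
model.fuse_model()                      # fuse conv + bn + relu blocks

# Topology adaptations (e.g., a reduced width multiplier via
# torchvision.models.mobilenet_v2(width_mult=0.5)) would be applied to the
# float model before this point.

model.qconfig = torch.quantization.get_default_qconfig("fbgemm")
torch.quantization.prepare(model, inplace=True)

with torch.no_grad():                   # calibration pass (random data here,
    for _ in range(8):                  # real calibration images in practice)
        model(torch.randn(1, 3, 224, 224))

torch.quantization.convert(model, inplace=True)

with torch.no_grad():                   # sanity-check the int8 model
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)                        # torch.Size([1, 1000])
```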
This paper contributes to strengthening the relationship between machine learning and the theory of differential equations. In this context, the inverse problem of fitting the parameters, as well as the initial conditions, of a differential equation to some measurements constitutes a key issue. The paper explores an abstraction that can be used to construct a family of loss functions with the aim of fitting the solution of an initial value problem to a set of discrete or continuous measurements. It is shown that an extension of the adjoint equations can be used to derive the gradient of the loss function as a continuous analogue of backpropagation in machine learning. Numerical evidence is provided showing that, under reasonably controlled circumstances, the gradients obtained in this way can be used in a gradient descent scheme to fit the solution of an initial value problem to a set of continuous noisy measurements, as well as to a set of discrete noisy measurements that are recorded at uncertain times.
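A small sketch of this kind of inverse problem: fit both the parameters and the initial condition of an ODE to noisy measurements by gradient descent, with gradients obtained through the adjoint method as implemented in the third-party torchdiffeq package. The damped-oscillator dynamics, noise level, and optimizer settings are illustrative assumptions, not the paper's formulation.

```python
# Fit ODE parameters and the initial condition to noisy measurements using
# adjoint-based gradients (a continuous analogue of backpropagation).
# Requires the third-party `torchdiffeq` package.
import torch
import torch.nn as nn
from torchdiffeq import odeint_adjoint as odeint

class DampedOscillator(nn.Module):
    def __init__(self):
        super().__init__()
        self.omega = nn.Parameter(torch.tensor(1.0))   # to be fitted
        self.gamma = nn.Parameter(torch.tensor(0.1))   # to be fitted

    def forward(self, t, state):
        x, v = state[..., 0], state[..., 1]
        return torch.stack([v, -self.omega ** 2 * x - self.gamma * v], dim=-1)

# Synthetic "measurements": simulate with known parameters, then add noise.
t = torch.linspace(0.0, 10.0, 50)
with torch.no_grad():
    true_f = DampedOscillator()
    true_f.omega.fill_(2.0)
    true_f.gamma.fill_(0.3)
    y_true = odeint(true_f, torch.tensor([1.0, 0.0]), t)
measurements = y_true + 0.02 * torch.randn_like(y_true)

# Model with unknown parameters and unknown initial condition.
f = DampedOscillator()
y0 = nn.Parameter(torch.tensor([0.5, 0.5]))
opt = torch.optim.Adam(list(f.parameters()) + [y0], lr=0.05)

for step in range(200):
    opt.zero_grad()
    pred = odeint(f, y0, t)                     # adjoint used in backward()
    loss = ((pred - measurements) ** 2).mean()  # discrete least-squares loss
    loss.backward()
    opt.step()

print(float(f.omega), float(f.gamma), y0.detach())
```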
Language models demonstrate both quantitative improvements and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers, across model sizes spanning millions to billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; and social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.
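As a minimal sketch of how such JSON-defined tasks can be scored, the snippet below runs an exact-match evaluation loop over input/target pairs. The schema shown (a flat "examples" list) is a simplification for illustration, not the benchmark's full task format, and `toy_model` is a hypothetical stand-in for a real language model API.

```python
# Minimal exact-match evaluation loop over a BIG-bench-style JSON task.
# The schema and the toy model below are illustrative assumptions only.
import json

task = json.loads("""
{
  "name": "toy_arithmetic",
  "examples": [
    {"input": "What is 2 + 2?", "target": "4"},
    {"input": "What is 3 + 5?", "target": "8"}
  ]
}
""")

def toy_model(prompt: str) -> str:
    """Placeholder model: evaluates 'What is a + b?' prompts directly."""
    a, _, b = prompt.replace("What is ", "").replace("?", "").split()
    return str(int(a) + int(b))

correct = sum(toy_model(ex["input"]).strip() == ex["target"]
              for ex in task["examples"])
print(f"exact match: {correct / len(task['examples']):.2f}")
```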
Speech neuroprostheses have the potential to enable communication for individuals with dysarthria or anarthria. Recent advances have demonstrated high-quality text decoding and speech synthesis from electrocorticographic grids placed on the cortical surface. Here we investigate a less invasive measurement modality, namely stereotactic EEG (sEEG), which provides sparse sampling from multiple brain regions, including subcortical regions. To evaluate whether sEEG can also be used to synthesize high-quality audio from neural recordings, we employ a recurrent encoder-decoder framework based on modern deep learning methods. We demonstrate that high-quality speech can be reconstructed from these minimally invasive recordings despite the limited amount of training data. Finally, we use variational feature dropout to successfully identify the most informative electrode contacts.
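A schematic of the recurrent encoder-decoder idea: a GRU encoder over windows of sEEG features followed by a GRU decoder and a linear projection to mel-spectrogram frames. The feature dimensions, aligned equal-length sequence setup, and layer sizes are illustrative assumptions rather than the paper's exact architecture, and the variational feature dropout used for contact selection is not shown.

```python
# Schematic recurrent encoder-decoder mapping sEEG feature windows to
# mel-spectrogram frames. All sizes are illustrative assumptions.
import torch
import torch.nn as nn

class NeuralToSpeech(nn.Module):
    def __init__(self, n_contacts=64, hidden=128, n_mels=80):
        super().__init__()
        self.encoder = nn.GRU(n_contacts, hidden, num_layers=2,
                              batch_first=True, bidirectional=True)
        self.decoder = nn.GRU(2 * hidden, hidden, num_layers=1, batch_first=True)
        self.to_mel = nn.Linear(hidden, n_mels)

    def forward(self, x):               # x: (batch, time, contacts)
        enc, _ = self.encoder(x)        # (batch, time, 2*hidden)
        dec, _ = self.decoder(enc)      # (batch, time, hidden)
        return self.to_mel(dec)         # (batch, time, n_mels) spectrogram

if __name__ == "__main__":
    seeg = torch.randn(4, 200, 64)      # 4 windows, 200 frames, 64 contacts
    mel = NeuralToSpeech()(seeg)
    loss = nn.functional.mse_loss(mel, torch.randn_like(mel))  # regression
    print(mel.shape, float(loss))
```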
While the brain connectivity network can inform the understanding and diagnosis of developmental dyslexia, its cause-effect relationships have not yet been sufficiently examined. Employing electroencephalography signals and band-limited white noise stimulus at 4.8 Hz (prosodic-syllabic frequency), we measure the phase Granger causalities among channels to identify differences between dyslexic learners and controls, thereby proposing a method to calculate directional connectivity. As causal relationships run in both directions, we explore three scenarios, namely channels' activity as sources, as sinks, and in total. Our proposed method can be used for both classification and exploratory analysis. In all scenarios, we find confirmation of the established right-lateralized Theta sampling network anomaly, in line with the temporal sampling framework's assumption of oscillatory differences in the Theta and Gamma bands. Further, we show that this anomaly primarily occurs in the causal relationships of channels acting as sinks, where it is significantly more pronounced than when only total activity is observed. In the sink scenario, our classifier obtains 0.84 and 0.88 accuracy and 0.87 and 0.93 AUC for the Theta and Gamma bands, respectively.
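The general recipe behind such a directional-connectivity analysis can be sketched as follows: extract each channel's instantaneous phase with the Hilbert transform, then compare restricted and full autoregressive fits to obtain a causality index per direction. The synthetic two-channel data, model order, and plain least-squares formulation below are illustrative stand-ins for the paper's phase Granger-causality estimator.

```python
# Illustrative phase-based Granger-causality index on synthetic data.
import numpy as np
from scipy.signal import hilbert

def lagged(x, order):
    """Design matrix of `order` past values of x, aligned with x[order:]."""
    return np.column_stack([x[order - k - 1:len(x) - k - 1] for k in range(order)])

def granger_index(target, source, order=5):
    """log(var_restricted / var_full): larger values suggest stronger source -> target influence."""
    y = target[order:]
    X_r = lagged(target, order)                          # past of target only
    X_f = np.column_stack([X_r, lagged(source, order)])  # plus past of source
    res_r = y - X_r @ np.linalg.lstsq(X_r, y, rcond=None)[0]
    res_f = y - X_f @ np.linalg.lstsq(X_f, y, rcond=None)[0]
    return np.log(res_r.var() / res_f.var())

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):                      # y is driven by past x
    y[t] = 0.6 * y[t - 1] + 0.5 * x[t - 2] + 0.1 * rng.normal()

phase_x = np.angle(hilbert(x))             # instantaneous phase of each channel
phase_y = np.angle(hilbert(y))
print("x -> y:", granger_index(phase_y, phase_x))
print("y -> x:", granger_index(phase_x, phase_y))
```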
New-architecture GPUs like the A100 are now equipped with multi-instance GPU (MIG) technology, which allows the GPU to be partitioned into multiple small, isolated instances. This technology provides more flexibility for users to support both deep learning training and inference workloads, but efficiently utilizing it can still be challenging. The vision of this paper is to provide a more comprehensive and practical benchmark study for MIG in order to eliminate the need for tedious manual benchmarking and tuning efforts. To achieve this vision, the paper presents MIGPerf, an open-source tool that streamlines the benchmark study for MIG. Using MIGPerf, the authors conduct a series of experiments, including deep learning training and inference characterization on MIG, GPU sharing characterization, and framework compatibility with MIG. The results of these experiments provide new insights and guidance for users to effectively employ MIG, and lay the foundation for further research on the orchestration of hybrid training and inference workloads on MIGs. The code and results are released at https://github.com/MLSysOps/MIGProfiler. This work is still in progress and more results will be published soon.
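The kind of measurement such a benchmark automates can be probed with a few lines of PyTorch: time forward passes of a model on whichever device CUDA_VISIBLE_DEVICES points to (a specific MIG instance is selected by its "MIG-..." UUID outside Python). The model, batch size, and iteration counts below are illustrative, and this sketch is not MIGPerf's own API.

```python
# Minimal inference-throughput probe of the kind a MIG benchmark automates.
import time
import torch
from torchvision.models import resnet50

device = "cuda" if torch.cuda.is_available() else "cpu"
model = resnet50().eval().to(device)
batch = torch.randn(8, 3, 224, 224, device=device)

with torch.no_grad():
    for _ in range(10):                  # warm-up iterations
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    iters = 20
    for _ in range(iters):
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"throughput: {iters * batch.shape[0] / elapsed:.1f} images/s on {device}")
```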
There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to read out information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .
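A hedged sketch of a supervised baseline of the sort evaluated for brain-region prediction: a standard ResNet adapted to single-channel slices and trained with cross-entropy. The random tensors stand in for the actual MTNeuro data loaders, and the class count and hyperparameters are illustrative assumptions, not the benchmark's reference models.

```python
# Placeholder supervised baseline for slice-level brain-region classification.
import torch
import torch.nn as nn
from torchvision.models import resnet18

n_regions = 4                                   # number of region classes (illustrative)
model = resnet18(num_classes=n_regions)
# Adapt the first convolution to single-channel (grayscale) X-ray slices.
model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(5):                           # loop over fake batches
    images = torch.randn(16, 1, 128, 128)       # stand-in for real slices
    labels = torch.randint(0, n_regions, (16,))
    opt.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    opt.step()
    print(f"step {step}: loss {loss.item():.3f}")
```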